class: center, middle, inverse, title-slide # The challenge of everyday statistics in 30 minutes ### Peter Geelan-Small - Stats Central, UNSW ### 29th July, 2021 ---
<style type="text/css"> .remark-slide-content { font-size: 28px; padding: 1em 1em 1em 1em; } </style> # Background Statistics in research - May need statistics to get *information* from your data - Information includes relationship among system variables with associated uncertainty - Start with good design - fancy statistics can't usually fix holes in design - At start of study, make a statistical plan of how to analyse data - "To consult the statistician after an experiment is finished is often merely to ask [them] to conduct a post mortem examination. [They] can perhaps say what the experiment died of." (Ronald Fisher, 1938. *Sankhya* 4: 14-17) --- # Statistical plan - Research question `\(\;\rightarrow\;\)` objectives/hypotheses - Objectives/hypotheses framed in terms of - specific *outcome* and - possible explanatory or predictor variables associated with that outcome How do you know what statistical analysis methods to use? - Depends on: - study design - data structure, independent data or not, ... - type of outcome variable - quantitative (continuous, proportion, count), categorical (binary, ordered, nominal) --- # Models - are there really so many? - 1820s - linear regression based on normal distribution - Early 1900s - Pearson's `\(\chi^2\)` (chi-squared) test - 1908 - `\(t\)` test - 1920s or so - analysis of variance (ANOVA) and analysis of covariance (ANCOVA) - 1972 - generalised linear models - models based on normal, binomial, Poisson and other distributions unified --- # Models - are there really so many? Model names are an accident of history <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" /> --- # Example data and analyses <img src="data:image/png;base64,#harris_title.png" width="4408" style="display: block; margin: auto;" /> Harris et al. 
(2017) https://doi.org/10.1371/journal.pone.0188233 --- # Two independent groups **Is knot-tying time associated with watching an expert demonstration of the technique?** *Outcome:* Time (s) to tie a knot - continuous *Predictor:* Training condition (2 groups - control, expert) *Data structure:* Independent *Model:* Two independent sample `\(t\)` test - normal distribution assumption `$$\mathrm{Time} = \beta_0 + \beta_1 \; \mathrm{Condition_{expert}}$$` --- # Two independent sample `\(t\)` test *Assumptions* - Independence - the two samples are independent - data values within each sample are independent - Constant variance - data values in each group have "same" variance - Normal distribution - data in each group is normally distributed --- # Two independent sample `\(t\)` test *Assumptions* - Judge if assumptions are satisfied using diagnostic plots - constant variance: box plot, residual vs. fitted value plot (often hard-wired in software) - normal quantile-quantile (Q-Q) plot - We recommend you do *not* do hypothesis tests on assumptions - Use hypothesis tests only for specific research questions --- # Two independent sample `\(t\)` test .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" /> ] Positive skew is evident --- # Two independent sample `\(t\)` test .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> ] Data deviates visibly from normal distribution --- # Two independent sample `\(t\)` test .pull-left[ - Skewness in 
outcome variable can distort model - Outlying points can exert undue influence - Log-transforming outcome variable may fix this - Log-transformation appears successful here ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" /> ] --- # Two independent sample `\(t\)` test .pull-left[ - Variances appear equal ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" /> ] --- # Two independent sample `\(t\)` test .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" /> ] Normal distribution assumption quite well satisfied --- # Two independent sample `\(t\)` test Carry out two independent sample `\(t\)` test (data on log scale) <table class="table" style="font-size: 18px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> contrast </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> df </th> <th style="text-align:right;"> t.ratio </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Control - Expert </td> <td style="text-align:right;"> 0.025 </td> <td style="text-align:right;"> 0.126 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 0.194 </td> <td style="text-align:right;"> 0.847 </td> </tr> </tbody> </table> *Conclusion* - There is no evidence against equal group means (p = 0.85). *Note* You can carry out the `\(t\)` test as a regression model. It is a special case of regression. 
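This equivalence can be sketched as follows (the slides use R; this is a minimal Python illustration with simulated stand-in data, not the study's data):

```python
# The two-sample t test and a regression of the outcome on a 0/1 group
# indicator give identical inference. Data are simulated stand-ins for
# the (log-scale) knot-tying times.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
log_control = rng.normal(loc=4.0, scale=0.5, size=30)   # log times (s)
log_expert = rng.normal(loc=3.95, scale=0.5, size=30)

# Two independent sample t test (equal variances assumed)
t, p_ttest = stats.ttest_ind(log_control, log_expert, equal_var=True)

# Same test as regression: logTime = b0 + b1 * indicator(Expert)
y = np.concatenate([log_control, log_expert])
x = np.concatenate([np.zeros(30), np.ones(30)])          # 0 = Control, 1 = Expert
fit = stats.linregress(x, y)

# The slope is the Expert - Control difference in means;
# its p value equals the t test's p value
print(round(p_ttest, 6), round(fit.pvalue, 6))
```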
(In SPSS, "General Linear Model"; in R, "lm") --- # Two independent sample `\(t\)` test Fitting the model on the log scale gives smaller estimated standard errors *Raw data* <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Condition </th> <th style="text-align:right;"> Arithmetic.mean </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> df </th> <th style="text-align:right;"> lower.CL </th> <th style="text-align:right;"> upper.CL </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Control </td> <td style="text-align:right;"> 59.73 </td> <td style="text-align:right;"> 5.83 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 48.06 </td> <td style="text-align:right;"> 71.40 </td> </tr> <tr> <td style="text-align:left;"> Expert </td> <td style="text-align:right;"> 59.47 </td> <td style="text-align:right;"> 5.83 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 47.80 </td> <td style="text-align:right;"> 71.15 </td> </tr> </tbody> </table> *Log-transformed data* <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Condition </th> <th style="text-align:right;"> Geometric.mean </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> df </th> <th style="text-align:right;"> lower.CL </th> <th style="text-align:right;"> upper.CL </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Control </td> <td style="text-align:right;"> 53.47 </td> <td style="text-align:right;"> 4.78 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 44.72 </td> <td style="text-align:right;"> 63.95 </td> </tr> <tr> <td style="text-align:left;"> Expert </td> <td style="text-align:right;"> 52.18 </td> <td style="text-align:right;"> 4.66 </td> <td style="text-align:right;">
58 </td> <td style="text-align:right;"> 43.63 </td> <td style="text-align:right;"> 62.40 </td> </tr> </tbody> </table> --- # More than two independent groups **Is knot-tying time associated with observational learning? Compare the control group against each other group.** *Outcome:* Time (s) to tie a knot - continuous *Predictor:* Training condition (4 groups - control, novice, mixed, expert) *Data structure:* Independent *Model:* Analysis of variance - normal distribution assumption Multiple two-sample `\(t\)` tests? - No! Running several pairwise tests inflates the chance of a false positive --- # More than two independent groups Model equation for ANOVA model `$$\mathrm{Time} = \beta_0 + \beta_1 \, X_{\mathrm{Expert}} + \beta_2 \, X_{\mathrm{Mixed}} + \beta_3 \, X_{\mathrm{Novice}}$$` The *X* variables take the value 0 or 1 to show which group an observation is in. They are "indicator variables" or "dummy variables" --- # ANOVA: More than 2 independent groups *Assumptions* - Independence - data values are independent - Constant variance - residuals have constant variance - assess with residuals vs. fitted values plot - Normal distribution - residuals are normally distributed - assess using normal Q-Q plot Residual in ANOVA = observed data value - group mean --- # ANOVA: More than 2 independent groups .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" /> ] --- # ANOVA: More than 2 independent groups ``` ## Analysis of Variance Table ## ## Response: logKTTime ## Df Sum Sq Mean Sq F value Pr(>F) ## Condition2 3 0.0909 0.030299 0.1194 0.9486 ## Residuals 116 29.4433 0.253821 ``` *Conclusion* There is no evidence that knot-tying time is associated with observational learning (p = 0.95). 
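The overall F test and the dummy-variable coding above can be sketched as follows (Python with simulated stand-in data, not the study's data; the slides use R):

```python
# One-way ANOVA for the four training conditions, plus the equivalent
# regression with indicator (dummy) variables. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
groups = {g: rng.normal(loc=4.0, scale=0.5, size=30)
          for g in ["Control", "Expert", "Mixed", "Novice"]}

# Overall F test: is there evidence of any difference among group means?
f_stat, p_value = stats.f_oneway(*groups.values())

# The same model as a regression: the intercept is the Control mean and
# each coefficient is that group's difference from Control.
y = np.concatenate(list(groups.values()))
X = np.column_stack([
    np.ones(120),                 # intercept (Control is the baseline)
    np.repeat([0, 1, 0, 0], 30),  # X_Expert
    np.repeat([0, 0, 1, 0], 30),  # X_Mixed
    np.repeat([0, 0, 0, 1], 30),  # X_Novice
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}, Control mean = {beta[0]:.3f}")
```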
--- # ANOVA: More than 2 independent groups - As *p* is large, no comparisons with control would be made - If *p* were small, those comparisons would be made and P values would need to be adjusted for multiple comparisons <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> contrast </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> p.raw </th> <th style="text-align:right;"> p.dunnett </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Expert - Control </td> <td style="text-align:right;"> -0.02 </td> <td style="text-align:right;"> 0.13 </td> <td style="text-align:right;"> 0.85 </td> <td style="text-align:right;"> 0.99 </td> </tr> <tr> <td style="text-align:left;"> Mixed - Control </td> <td style="text-align:right;"> 0.03 </td> <td style="text-align:right;"> 0.13 </td> <td style="text-align:right;"> 0.79 </td> <td style="text-align:right;"> 0.97 </td> </tr> <tr> <td style="text-align:left;"> Novice - Control </td> <td style="text-align:right;"> -0.04 </td> <td style="text-align:right;"> 0.13 </td> <td style="text-align:right;"> 0.77 </td> <td style="text-align:right;"> 0.96 </td> </tr> </tbody> </table> - There are particular methods for adjusting P values for specific types of multiple comparisons --- # ANOVA: More than 2 independent groups Model equation (ANOVA is a special case of regression) <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Estimate </th> <th style="text-align:right;"> SE </th> <th style="text-align:right;"> t.value </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 3.979 </td> <td style="text-align:right;"> 0.092 </td> 
<td style="text-align:right;"> 43.261 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:left;"> Condition2Expert </td> <td style="text-align:right;"> -0.025 </td> <td style="text-align:right;"> 0.130 </td> <td style="text-align:right;"> -0.189 </td> <td style="text-align:right;"> 0.851 </td> </tr> <tr> <td style="text-align:left;"> Condition2Mixed </td> <td style="text-align:right;"> 0.034 </td> <td style="text-align:right;"> 0.130 </td> <td style="text-align:right;"> 0.261 </td> <td style="text-align:right;"> 0.795 </td> </tr> <tr> <td style="text-align:left;"> Condition2Novice </td> <td style="text-align:right;"> -0.039 </td> <td style="text-align:right;"> 0.130 </td> <td style="text-align:right;"> -0.297 </td> <td style="text-align:right;"> 0.767 </td> </tr> </tbody> </table> `$$\mathrm{Time} = \beta_0 + \beta_1 \, X_{\mathrm{Expert}} + \beta_2 \, X_{\mathrm{Mixed}} + \beta_3 \, X_{\mathrm{Novice}}$$` <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Intercept (Control) </th> <th style="text-align:right;"> Expert </th> <th style="text-align:right;"> Mixed </th> <th style="text-align:right;"> Novice </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Estimate </td> <td style="text-align:right;"> 3.979 </td> <td style="text-align:right;"> -0.025 </td> <td style="text-align:right;"> 0.034 </td> <td style="text-align:right;"> -0.039 </td> </tr> <tr> <td style="text-align:left;"> Mean </td> <td style="text-align:right;"> 3.979 </td> <td style="text-align:right;"> 3.955 </td> <td style="text-align:right;"> 4.013 </td> <td style="text-align:right;"> 3.941 </td> </tr> </tbody> </table> --- # Mixed model: repeated measurements **Proficiency in instrument control: Is smooth instrument movement related to observational learning over time?** *Outcome:* Mean jerk (change in acceleration) - 
continuous *Predictors:* - Time (baseline, post-observation, retention) - categorical - Training condition (control, novice, mixed, expert) - categorical - Time-Condition interaction *Data structure:* Non-independent *Model:* Linear mixed model - normal distribution assumption --- # Mixed model: repeated measurements *Model equation* Mean jerk = overall mean + time effect + condition effect + time:condition effect + (subject effect) - time: 2 dummy variables - condition: 3 dummy variables - time-condition: 6 dummy variables - subject: accounts for grouping of data values by subject (i.e. non-independence) --- # Mixed model: repeated measurements .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-36-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-37-1.png" style="display: block; margin: auto;" /> ] --- # Mixed model: repeated measurements <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-39-1.png" style="display: block; margin: auto;" /> --- # Mixed model: repeated measurements .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-42-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-43-1.png" style="display: block; margin: auto;" /> ] Some improvement in stabilising variance from log transformation --- # Mixed model: repeated measurements .pull-left[ Some deviation from normal distribution evident ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-45-1.png" style="display: block; margin: auto;" /> ] --- # Mixed model: repeated measurements <table class="table" style="font-size: 18px; width: auto !important; 
margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> numDF </th> <th style="text-align:right;"> denDF </th> <th style="text-align:right;"> F-value </th> <th style="text-align:right;"> p-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Condition </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 116 </td> <td style="text-align:right;"> 1.232 </td> <td style="text-align:right;"> 0.301 </td> </tr> <tr> <td style="text-align:left;"> Time </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 228 </td> <td style="text-align:right;"> 10.784 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:left;"> Condition:Time </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 228 </td> <td style="text-align:right;"> 0.600 </td> <td style="text-align:right;"> 0.730 </td> </tr> </tbody> </table> There is no evidence of a Condition-Time interaction (p = 0.73), so it is dropped and the model refitted without it <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> numDF </th> <th style="text-align:right;"> denDF </th> <th style="text-align:right;"> F-value </th> <th style="text-align:right;"> p-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Condition </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 116 </td> <td style="text-align:right;"> 1.229 </td> <td style="text-align:right;"> 0.302 </td> </tr> <tr> <td style="text-align:left;"> Time </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 234 </td> <td style="text-align:right;"> 10.910 </td> <td style="text-align:right;"> 0.000 </td> </tr> </tbody> </table> There is no evidence that training regime is associated with mean jerk (p = 0.30) --- # Mixed model: repeated measurements .pull-left[ <img
src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-53-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-54-1.png" style="display: block; margin: auto;" /> ] Training had no apparent impact, as the task was simple (Harris et al.). --- # Linear model: linear regression **Proficiency in instrument control: Is smooth instrument movement related to error rate in a ring-carrying exercise?** *Outcome:* No. errors per sec. (baseline) - continuous *Predictor:* Mean jerk (change in acceleration) - continuous *Data structure:* Independent *Model:* Linear regression - normal distribution assumption `$$\mathrm{ErrorRate} = \beta_0 + \beta_1 \; \mathrm{MeanJerk}$$` --- # Linear model: linear regression Data <table class="table" style="font-size: 18px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;"> Participant </th> <th style="text-align:right;"> ERRORS.SEC_RT1 </th> <th style="text-align:right;"> RT1Errors </th> <th style="text-align:right;"> RT1Time </th> <th style="text-align:right;"> MeanJerk1 </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.060 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 149.37 </td> <td style="text-align:right;"> 0.044 </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 0.150 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 46.67 </td> <td style="text-align:right;"> 0.073 </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0.094 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 116.52 </td> <td style="text-align:right;"> 0.041 </td> </tr> <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 0.198 </td> <td
style="text-align:right;"> 10 </td> <td style="text-align:right;"> 50.62 </td> <td style="text-align:right;"> 0.068 </td> </tr> <tr> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 0.127 </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 94.35 </td> <td style="text-align:right;"> 0.048 </td> </tr> <tr> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 0.161 </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> 92.97 </td> <td style="text-align:right;"> 0.048 </td> </tr> </tbody> </table> --- # Linear model: linear regression .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-56-1.png" style="display: block; margin: auto;" /> Some positive skewness ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-57-1.png" style="display: block; margin: auto;" /> ] --- # Linear model: linear regression .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-58-1.png" style="display: block; margin: auto;" /> ] .pull-right[ - Linear relationship - *Variability* of error rate increases with mean jerk ] --- # Linear model: linear regression ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.102935 0.024127 -4.2663 4.077e-05 ## MeanJerk1 3.990533 0.419750 9.5069 3.397e-16 ## ## n = 118, p = 2, Residual SE = 0.06182, R-Squared = 0.44 ``` Can we accept this model as valid? Check the assumptions! 
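The fit and the residuals behind these checks can be sketched like this (Python with simulated stand-in data, not the study's data; the slides use R):

```python
# Simple linear regression of error rate on mean jerk, and the residuals
# used for the diagnostic checks. Data are simulated; the noise is made
# to grow with mean jerk, mimicking the fanning pattern in the slides.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mean_jerk = rng.uniform(0.03, 0.09, size=118)
error_rate = -0.10 + 4.0 * mean_jerk + rng.normal(0.0, 1.0, 118) * mean_jerk

fit = stats.linregress(mean_jerk, error_rate)
fitted = fit.intercept + fit.slope * mean_jerk
residuals = error_rate - fitted

# Plot residuals against fitted values (e.g. with matplotlib) to look for
# a fan shape; plot their quantiles against normal quantiles (Q-Q plot)
# to check the normal distribution assumption.
print(f"slope = {fit.slope:.2f}, R^2 = {fit.rvalue**2:.2f}")
```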
*Assumptions* - Residuals have constant variance - Residuals are normally distributed --- # Linear model: linear regression .pull-left[ Assumption 1: Residuals have constant variance - Fanning pattern suggests variance not constant ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-60-1.png" style="display: block; margin: auto;" /> ] --- # Linear model: linear regression .pull-left[ Assumption 2: Residuals normally distributed - No gross deviation apparent ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-61-1.png" style="display: block; margin: auto;" /> ] --- # Linear model: linear regression .pull-left[ Address non-constant variance - Log-transforming positively skewed outcome may be useful - Log transformation may be a little too strong ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-63-1.png" style="display: block; margin: auto;" /> ] --- # Linear model: linear regression .pull-left[ Log transformation has stabilised variance to some degree ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-64-1.png" style="display: block; margin: auto;" /> ] --- # Linear model: linear regression Model with log-transformed response variable ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -4.31634 0.23699 -18.2134 < 2.2e-16 ## MeanJerk1 34.78251 4.12292 8.4364 1.051e-13 ## ## n = 118, p = 2, Residual SE = 0.60724, R-Squared = 0.38 ``` --- # Generalised linear model - Poisson But "errors per sec." is really a rate ... 
- different types of regression model for different types of outcome variable - count over given time intervals - Poisson distribution (integer values, 0 is a possible value, no upper limit) - *generalised* linear model (non-normal outcome variable) --- # Generalised linear model - Poisson ``` ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) -4.0291 0.1374 -29.323 < 2.2e-16 ## MeanJerk1 32.1147 2.3108 13.897 < 2.2e-16 ## ## n = 118 p = 2 ## Deviance = 215.46990 Null Deviance = 387.82392 (Difference = 172.35402) ``` *Model equation* `$$\log{Y} = \beta_0 + \beta_1 X$$` `$$\log{\mathrm{(No.\;of\;errors)}} = \beta_0 + \beta_1 \, \mathrm{MeanJerk} + \log{\mathrm{(Time\;period)}}$$` Must add the log "time period" term (an *offset*), as each participant's time to complete was different --- # Generalised linear model - Poisson Check the assumptions! *Assumptions* - Quantile residuals have constant variance (constant dispersion) - Quantile residuals are normally distributed - Overdispersion not present --- # Generalised linear model - Poisson .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-67-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-68-1.png" style="display: block; margin: auto;" /> ] Constant var. and normal dist. assumptions are quite well satisfied --- # Errors per sec. - which model? .pull-left[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-70-1.png" style="display: block; margin: auto;" /> ] .pull-right[ <img src="data:image/png;base64,#stats_surgical_rsch_soc_2021-07-29_files/figure-html/unnamed-chunk-71-1.png" style="display: block; margin: auto;" /> ] Poisson model: probably better inferences - but overdispersion has not yet been checked! --- # References